Abstract

Given a geodesic space $(E,d)$, we show that full ordinal knowledge of the metric $d$, i.e. knowledge of the function

$$D_d : (w,x,y,z) \mapsto \mathbf{1}_{d(w,x) \le d(y,z)},$$

determines the metric $d$ uniquely, up to a constant factor. For a subspace $E_n$ of $n$ points of $E$, converging in Hausdorff distance to $E$, we construct a metric $d_n$ on $E_n$, based only on the knowledge of $D_d$ on $E_n$, and establish a sharp upper bound on the Gromov-Hausdorff distance between $(E_n,d_n)$ and $(E,d)$.


1 Introduction 

Given a set of unknown points that are known to belong to $\mathbb{R}^k$ and whose pairwise distances are known, it is useful to be able to find an embedding of these points in $\mathbb{R}^k$. Methods to find this embedding are known as multi-dimensional scaling (MDS) methods, and are widely used as data visualization tools, in particular in the social sciences. [Tor52] is often considered a pioneering paper in MDS.

This method, which requires knowing the distance between each pair of points, is sometimes too restrictive in practice. It may happen that the actual distances are unknown, but that ordinal information on the distances can be obtained. Namely, for any four points $w,x,y,z$ in the dataset, the distances $\|w-x\|$ and $\|y-z\|$ are not known, but they can be compared:

$$\mathbf{1}_{\|w-x\| \le \|y-z\|}$$

is known. This is a typical case in the social sciences, and there exist methods developed in this context. This problem is referred to as non-metric MDS or ordinal embedding.

[She62a] and [She62b] introduced non-metric MDS techniques, which find an embedding of the data in $\mathbb{R}^{n-1}$ given a dataset of $n$ points. [Kru64] introduced a procedure to obtain the best possible representation in a $k$-dimensional space, for a given $k < n$. These techniques are now widely used in practical applications to visualize data.

The book [YH87] deals with these methods and some applications.

Theoretical guarantees on these methods had not been studied until recently. Namely, is it guaranteed that there exists a unique embedding of the data? If the dataset grows until it fills a subset of the space $\mathbb{R}^k$, does the embedding of the dataset converge to the limit subset? Let us state these questions formally.

Let $f$ be a function defined on $E \subset \mathbb{R}^k$ with values in $\mathbb{R}^k$ such that for all $w,x,y,z \in E$,

$$\|w - x\| < \|y - z\| \text{ if and only if } \|f(w) - f(x)\| < \|f(y) - f(z)\|. \quad (1)$$

Such functions are said to be isotonic. The function $f$ embeds the dataset $E$ into $\mathbb{R}^k$ and preserves the known ordinal information on the distances. Clearly $f$ need not be the identity function; any similarity (i.e. any $f$ for which there exists $C > 0$ such that for any $x,y$, $\|f(x) - f(y)\| = C\|x - y\|$) fits. The first question is then: are similarities the only functions that satisfy (1)? This is the uniqueness question.
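
The remark that similarities are isotonic can be checked numerically. The following is a minimal Python sketch, not part of the original paper: the map `similarity` (a rotation, a scaling by an arbitrary factor 2.5 and a translation of the plane) and the helper `is_isotonic` are hypothetical names chosen for illustration, and near-ties between distances are skipped to avoid floating-point artefacts.

```python
import itertools
import math
import random

def dist(p, q):
    """Euclidean distance in the plane."""
    return math.hypot(p[0] - q[0], p[1] - q[1])

def similarity(p, scale=2.5, angle=0.7, shift=(1.0, -3.0)):
    """Rotate by `angle`, scale by `scale`, then translate (assumed example map)."""
    c, s = math.cos(angle), math.sin(angle)
    return (scale * (c * p[0] - s * p[1]) + shift[0],
            scale * (s * p[0] + c * p[1]) + shift[1])

def is_isotonic(points, f, tol=1e-9):
    """Check ||w-x|| < ||y-z||  <=>  ||f(w)-f(x)|| < ||f(y)-f(z)|| on all quadruples.

    Quadruples whose two distances differ by at most `tol` are skipped.
    """
    images = [f(p) for p in points]
    idx = range(len(points))
    for w, x, y, z in itertools.product(idx, repeat=4):
        a, b = dist(points[w], points[x]), dist(points[y], points[z])
        if abs(a - b) <= tol:
            continue
        fa, fb = dist(images[w], images[x]), dist(images[y], images[z])
        if (a < b) != (fa < fb):
            return False
    return True

rng = random.Random(0)  # fixed seed so that the sample is reproducible
sample = [(rng.random(), rng.random()) for _ in range(8)]
print(is_isotonic(sample, similarity))
```

A map that is not a similarity, such as the projection `lambda p: (p[0], 0.0)`, typically fails this check, which is the easy direction of the uniqueness question.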

Let $E_n$ be a set of $n$ points in $\mathbb{R}^k$ for which only $\mathbf{1}_{\|w-x\| \le \|y-z\|}$ is known for any given points $w,x,y,z \in E_n$. For any function $f : E_n \to \mathbb{R}^k$ that satisfies (1), is the limit of $f(E_n)$ equal to the limit $E$ of $E_n$ (up to a similarity)? This is the consistency question.

[KvL14] provides a positive answer to both uniqueness (up to a similarity) and consistency.

The rate of convergence of the dataset embedding to its limit is tackled in [Ari15], when the limit set $E$ is a bounded connected open set. It basically states that the rate of convergence of $f(E_n)$ to $f(E)$ in Hausdorff metric is the same as that of $E_n$ to $E$, up to a constant factor that grows with the dimension $k$. Methods developed in [KvL14] and [Ari15] use the vector space structure of $\mathbb{R}^k$.

The aim of this paper is to provide similar results in non-Euclidean spaces.

This investigation is motivated by the use of this type of information in manifold learning. Unweighted $k$-nearest neighbor methods are widely used and fit in the framework where only partial ordinal information on distances is known. For instance, the ISOMAP method introduced in [TDSL00] aims to learn a non-linear manifold from a weighted $k$-nearest neighbor graph. Little is known about what can be inferred from unweighted $k$-nearest neighbor graphs.

While previous results on $\mathbb{R}^k$ are non-constructive, our investigation provides a way to compute a metric from a dataset, for which theoretical guarantees are obtained.

In order to consider the problem for non-Euclidean spaces, we make the following remark for the $\mathbb{R}^k$ case. If $f$ is a similarity with ratio $C$, then $E$ and $f(E)/C$ are isometric. Stated differently, $E$ and $f(E)$ can be rescaled to have the same diameter, and are then isometric. In particular, their Gromov-Hausdorff distance is zero.

This remark allows us to state the problem in terms of isometry or Gromov-Hausdorff distance between metric spaces. Let us state the problem formally.

Given a metric space $(E,d)$ for which only the function

$$D_d : (w,x,y,z) \mapsto \mathbf{1}_{d(w,x) \le d(y,z)}$$

is known (in other words, the metric itself is unknown but any two distances can be compared), is it possible to recover the metric $d$?

The answer is clearly no when the problem is formulated this way, because multiplying the metric by a constant does not change the known function $D_d$ (just as the space can be reconstructed only up to a similarity in $\mathbb{R}^k$). More importantly, given a sub-additive, increasing, positive function $l$ such that $l(x) = 0 \iff x = 0$, the composed function $l \circ d$ is a metric that gives the same observed function:

$$D_d = D_{l \circ d}.$$

However, one can observe that if $(E,d)$ is a geodesic space, then $(E, l \circ d)$ is geodesic only if $l$ is a linear function (i.e. if $l : x \mapsto cx$ for some $c > 0$). Thus, if the space $(E,d)$ is known to be geodesic, the latter argument fails.
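
Both observations can be illustrated on the segment $[0,1]$ with $d(x,y) = |x-y|$. The following minimal Python sketch is an illustration under stated assumptions, not taken from the paper: the choice $l = \sqrt{\cdot}$ is just one example of a sub-additive increasing function, the helper names are hypothetical, and the middle-point search is restricted to a finite dyadic grid.

```python
import itertools
import math

def d(x, y):
    """Usual metric on the real line."""
    return abs(x - y)

def sqrt_d(x, y):
    """Transformed metric l o d with l = sqrt (sub-additive and increasing)."""
    return math.sqrt(abs(x - y))

def ordinal(metric, w, x, y, z):
    """Comparison function D: 1 if metric(w, x) <= metric(y, z), else 0."""
    return 1 if metric(w, x) <= metric(y, z) else 0

def same_ordinal_information(points, m1, m2):
    """True if D_{m1} and D_{m2} agree on every quadruple of `points`."""
    return all(ordinal(m1, *q) == ordinal(m2, *q)
               for q in itertools.product(points, repeat=4))

def has_midpoint(metric, x, y, candidates, tol=1e-12):
    """True if some candidate z satisfies metric(x,z) = metric(z,y) = metric(x,y)/2."""
    half = metric(x, y) / 2
    return any(abs(metric(x, z) - half) <= tol and abs(metric(z, y) - half) <= tol
               for z in candidates)

grid = [i / 16 for i in range(17)]  # dyadic grid on [0, 1]
print(same_ordinal_information(grid, d, sqrt_d))   # same comparisons
print(has_midpoint(d, 0.0, 1.0, grid))             # 0.5 is a middle point for d
print(has_midpoint(sqrt_d, 0.0, 1.0, grid))        # no middle point for sqrt o d
```

For $\sqrt{d}$ a middle point $z$ of $(0,1)$ would need $\sqrt{z} = \sqrt{1-z} = 1/2$, which is impossible, so the failure is not an artefact of the grid.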
The paper is organized as follows.

We first show that the uniqueness result of [KvL14] holds for geodesic spaces, that is, $D_d$ determines $d$ up to a constant factor.

Secondly, we present our main result, which shows how to build a metric on a finite subspace $E_n$ of $E$ such that the resulting space converges in Gromov-Hausdorff metric to $E$, when only $D_d$ is known on $E_n$. Sharp bounds on this convergence are proven.

Then, statistical applications are developed.

Proofs of the results follow, and the paper ends with a short discussion.


2 Uniqueness of the metric 

In order to set the problem properly, recall the definition of a geodesic space.

Definition 1. Let $(E,d)$ be a complete metric space. If for any $x,y \in E$, there exists $z \in E$ such that

$$d(x,z) = d(y,z) = \frac{1}{2} d(x,y),$$

then $(E,d)$ is said to be a geodesic space, and $z$ is called a middle point of $(x,y)$.

A segment $[x,y]$ is a subset of $E$ such that there exists a continuous mapping $\gamma : [0,1] \to E$ such that $\gamma([0,1]) = [x,y]$ and for all $t \in [0,1]$,

$$d(x, \gamma(t)) = t\,d(x,y) \quad\text{and}\quad d(\gamma(t), y) = (1-t)\,d(x,y).$$
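
As a concrete non-Euclidean instance of Definition 1, consider a circle of circumference 1 with the arc-length metric. The following minimal Python sketch is an illustration only: the parametrisation of the circle by $[0,1)$ and the names `circle_dist`, `circle_midpoint` and `is_midpoint` are assumptions made here, not notation from the paper.

```python
def circle_dist(s, t):
    """Arc-length distance on a circle of circumference 1, points given in [0, 1)."""
    gap = abs(s - t) % 1.0
    return min(gap, 1.0 - gap)

def circle_midpoint(s, t):
    """A middle point of s and t, taken along a shortest arc."""
    delta = (t - s) % 1.0      # counter-clockwise gap from s to t
    if delta > 0.5:            # the clockwise arc is shorter
        delta -= 1.0
    return (s + delta / 2) % 1.0

def is_midpoint(z, s, t, tol=1e-12):
    """Check d(s, z) = d(z, t) = d(s, t) / 2 up to a floating-point tolerance."""
    half = circle_dist(s, t) / 2
    return (abs(circle_dist(s, z) - half) <= tol
            and abs(circle_dist(z, t) - half) <= tol)

print(is_midpoint(circle_midpoint(0.1, 0.9), 0.1, 0.9))  # middle point across 0
print(is_midpoint(0.3, 0.1, 0.9))                        # 0.3 is not a middle point
```

For antipodal points (gap exactly 1/2) there are two middle points and the sketch returns only one of them, which shows that middle points need not be unique.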

Our first result can then be stated as follows: the metric of a geodesic space is determined by ordinal information on the metric.

Theorem 2. Let $(E_1,d_1)$ and $(E_2,d_2)$ be two complete geodesic spaces such that there exists a one-to-one map $f$ such that

$$D_{d_1} = D_{d_2} \circ f \times f. \quad (2)$$

Then there exists $c > 0$ such that $f$ is an isometry between $(E_1,d_1)$ and $(E_2,c\,d_2)$.

Proof. We first show that the result is true when $E_1$ is restricted to any segment $[w,x]$.

Let $w,x \in E_1$. Since $E_1$ is geodesic, there exists a middle point $m$, so that

$$d_1(w,m) = d_1(m,x) = \frac{1}{2} d_1(w,x).$$

Since

$$1 = D_{d_1}(w,m,m,x) = D_{d_2} \circ f \times f\,(w,m,m,x) \quad\text{and}\quad 1 = D_{d_1}(x,m,m,w) = D_{d_2} \circ f \times f\,(x,m,m,w),$$

then

$$d_2(f(w),f(m)) = d_2(f(m),f(x)).$$

Thus, in order to show that $f(m)$ is a middle point of $[f(w),f(x)]$, it suffices to show that for any $m'$ such that $f(m')$ is a middle point of $[f(w),f(x)]$,

$$d_2(f(w),f(m)) \le d_2(f(w),f(m')).$$

Suppose that

$$d_2(f(w),f(m)) > d_2(f(w),f(m')),$$

then the equality

$$0 = D_{d_1}(w,m,w,m') = D_{d_2} \circ f \times f\,(w,m,w,m')$$

implies

$$d_1(w,m') < d_1(w,m).$$

Similarly, we can show that

$$d_1(m',x) < d_1(m,x),$$

which contradicts the fact that $m$ is a middle point of $[w,x]$.

We have thus shown that middle points are mapped to middle points by $f$.

Applying this recursively on a segment $[w,x]$, we show that for any $t \in [0,1]$ of the form

$$t = \frac{k}{2^n}$$

with $k,n \in \mathbb{N}$, and $u_t \in [w,x]$ such that $d_1(w,u_t) = t\,d_1(w,x)$, the following holds:

$$f(u_t) \in [f(w),f(x)] \quad\text{and}\quad d_2(f(w),f(u_t)) = t\,d_2(f(w),f(x)). \quad (3)$$

Since such $t$ are dense in $[0,1]$, the result holds for any $t \in [0,1]$ by continuity. Indeed, since for a sequence $w_t \to w$ there exists a sequence $s_t \to 0$ such that $s_t \ge t$ and $s_t$ is of the form $k/2^n$ with $k,n \in \mathbb{N}$, continuity of $f$ follows from

$$d_1(w,w_t) \le d_1(w,u_t) \implies d_2(f(w),f(w_t)) \le d_2(f(w),f(u_t)) = t\,d_2(f(w),f(x)).$$

Thus, we have shown that the result holds on any segment (with possibly different constants $c$). Take now $w,x,y,z \in E_1$ and set

$$c = \frac{d_1(w,x)}{d_2(f(w),f(x))}.$$

We want to show that the constant $c$ is the same for any other segment $[y,z]$, i.e.

$$d_1(y,z) = c\,d_2(f(y),f(z)).$$

Without loss of generality, we can suppose that $d_1(y,z) \le d_1(w,x)$. Thus, there exists $u \in [w,x]$ such that $d_1(y,z) = d_1(w,u) = t\,d_1(w,x)$ for some $t \in [0,1]$. This equality also provides

$$\begin{aligned}
d_1(y,z) &= t\,d_1(w,x) \\
&= t\,c\,d_2(f(w),f(x)) && \text{by definition of } c, \\
&= c\,d_2(f(w),f(u)) && \text{using (3),} \\
&= c\,d_2(f(y),f(z)) && \text{using (2) and } d_1(y,z) = d_1(w,u).
\end{aligned}$$

Thus, $f$ is an isometry between $(E_1,d_1)$ and $(E_2,c\,d_2)$. □

3 Construction of the metric

Now that we know that $D_d$ determines a geodesic metric $d$ up to a constant factor, how do we build it?

To give an answer, the problem needs to be properly posed.

Let $E_n = \{x_1,\dots,x_n\}$ be a subset of a compact geodesic space $(E,d)$ of diameter 1. Suppose that $(E_n)_{n\ge1}$ converges to $E$ in Hausdorff metric in $(E,d)$. Can we build a metric $d_n$ on $E_n$, with $d_n$ a function of $D_d$, so that $(E_n,d_n)$ converges to $(E,d)$ in Gromov-Hausdorff distance?

To set the notation, let us recall the definitions of the Hausdorff and Gromov-Hausdorff metrics.

Definition 3 (Hausdorff and Gromov-Hausdorff metric). Let $A, B$ be two subsets of a metric space $(E,d)$. The Hausdorff distance between $A$ and $B$ is defined by

$$d_H(A,B) = \inf\{\varepsilon > 0 \mid A \subset B^\varepsilon,\ B \subset A^\varepsilon\},$$

where $A^\varepsilon = \{x \in E;\ \exists a \in A \text{ s.t. } d(a,x) < \varepsilon\}$.

The Gromov-Hausdorff distance between two metric spaces $(E,d_E)$ and $(F,d_F)$ is defined as

$$d_{GH}(E,F) = \inf\{d_H(g(E),h(F)) \mid g : E \to G,\ h : F \to G \text{ isometric embeddings and } G \text{ metric space}\}.$$

More details on these metrics can be found in [BBI01].
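
For finite sets the infimum in the definition of $d_H$ reduces to a max-min formula. The following minimal Python sketch illustrates this for finite non-empty sets only; the function name `hausdorff` and its signature are choices made here. The Gromov-Hausdorff distance involves an infimum over all isometric embeddings and is not computed by this sketch.

```python
def hausdorff(A, B, dist):
    """Hausdorff distance between two finite non-empty sets A and B.

    `dist` is the metric of the ambient space, given as a function of two points.
    """
    def directed(S, T):
        # largest distance from a point of S to its nearest point in T
        return max(min(dist(s, t) for t in T) for s in S)

    return max(directed(A, B), directed(B, A))

def line_dist(x, y):
    """Usual metric on the real line, used here as an example ambient metric."""
    return abs(x - y)

print(hausdorff([0.0, 1.0], [0.0, 0.5, 1.0], line_dist))
```

In the example, every point of the first set belongs to the second, so the distance is driven by the point 0.5, which is at distance 0.5 from the first set.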

3.1 Main results 

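The idea of the proof of Theorem 2 can be used to construct a consistent pseudo-metric on $E_n$.

Definition 4 (Pseudo-metrics on $E_n$). Let $(E,d)$ be a complete compact geodesic space, with diameter 1. Set $E_n = \{x_1,\dots,x_n\} \subset E$. For $a,b \in E$, define (if it exists)

$$M_{ab} = \{z \in E;\ \max(d(a,z),d(b,z)) < d(a,b)\},$$

$$M^n_{ab} = M_{ab} \cap E_n \setminus \{a,b\},$$

$$m_{ab} \in \operatorname{argmin}\{\max(d(a,z),d(b,z));\ z \in M_{ab}\},$$

$$m^n_{ab} \in \operatorname{argmin}\{\max(d(a,z),d(b,z));\ z \in M^n_{ab}\},$$

and set $A^n_0 = (x,y)$, where $d(x,y) = \operatorname{diam}(E_n)$, and then for $p \ge 1$ and $A^n_{p-1} = (a^n_1,\dots,a^n_{2^{p-1}+1})$, if all $m^n_{a^n_i a^n_{i+1}}$ exist,

$$A^n_p = \left(a^n_1,\ m^n_{a^n_1 a^n_2},\ a^n_2,\ \dots,\ a^n_{2^{p-1}},\ m^n_{a^n_{2^{p-1}} a^n_{2^{p-1}+1}},\ a^n_{2^{p-1}+1}\right).$$

Then, for the largest $p$ such that $A^n_p$ exists, write $A^n_p = (a^n_1,\dots,a^n_{2^p+1})$ and define $c_n$ on $A^n_p \times A^n_p$ by

$$c_n(a^n_i, a^n_j) = |i-j|\,2^{-p},$$

and for any $p \ge 1$ such that $A^n_p$ exists and for any $u,v \in E$, set

$$d^+_{n,p}(u,v) = \min\{c_n(a,b);\ d(a,b) \ge d(u,v),\ a,b \in A^n_p\},$$

$$d^-_{n,p}(u,v) = \max\{c_n(a,b);\ d(a,b) \le d(u,v),\ a,b \in A^n_p\}.$$

Finally, set $p_n = \max\{p \in \mathbb{N}^*;\ A^n_p \text{ exists},\ \forall a,b \in A^n_p,\ d^+_{n,p}(a,b) = d^-_{n,p}(a,b)\}$.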
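
Definition 4 only uses comparisons of distances, so it can be turned into a procedure that never evaluates $d$ itself. The following is a minimal Python sketch under stated assumptions, not the author's implementation: the names `build_chains`, `bounds` and `select_level` are hypothetical, the oracle `leq(w, x, y, z)` is assumed to return whether $d(w,x) \le d(y,z)$, ties in the argmin are broken by keeping the first candidate, at least two hashable points are expected, and, as a simplification, a point already on the chain is never reused as a middle point.

```python
import itertools

def build_chains(points, leq):
    """Return [A_0, A_1, ...], each chain refining the previous one by middle points."""
    def larger(p, q):                      # the pair with the larger distance
        return q if leq(*p, *q) else p

    def radius(a, b, z):                   # pair realising max(d(a, z), d(b, z))
        return larger((a, z), (b, z))

    def midpoint(a, b, used):
        best = None
        for z in points:
            if z in used:
                continue
            r = radius(a, b, z)
            if leq(a, b, *r):              # need max(d(a,z), d(b,z)) < d(a,b)
                continue
            if best is None or not leq(*radius(a, b, best), *r):
                best = z                   # strictly smaller radius found
        return best

    pairs = list(itertools.combinations(points, 2))
    x, y = pairs[0]
    for p in pairs[1:]:                    # A_0: a pair realising the diameter
        x, y = larger((x, y), p)
    chains = [[x, y]]
    while True:
        prev, new = chains[-1], []
        used = set(prev)
        for a, b in zip(prev, prev[1:]):
            m = midpoint(a, b, used)
            if m is None:                  # some middle point is missing: stop
                return chains
            used.add(m)
            new += [a, m]
        chains.append(new + [prev[-1]])

def bounds(chains, p, leq, u, v):
    """Return (d^-_{n,p}(u, v), d^+_{n,p}(u, v)) for u, v in the sample."""
    step = 2.0 ** (-p)
    lower, upper = 0.0, 1.0                # values given by (a, a) and the end points
    for (i, a), (j, b) in itertools.combinations(enumerate(chains[p]), 2):
        c = (j - i) * step                 # c_n(a, b) = |i - j| 2^{-p}
        if leq(a, b, u, v):
            lower = max(lower, c)
        if leq(u, v, a, b):
            upper = min(upper, c)
    return lower, upper

def select_level(chains, leq):
    """Largest p >= 1 with d^+ = d^- on A_p x A_p, or None if there is none."""
    best = None
    for p in range(1, len(chains)):
        pairs = itertools.combinations(chains[p], 2)
        if all(lo == hi for lo, hi in (bounds(chains, p, leq, a, b) for a, b in pairs)):
            best = p
    return best

def leq_line(w, x, y, z):
    """Example oracle: comparisons for the usual metric on [0, 1]."""
    return abs(w - x) <= abs(y - z)

grid = [i / 16 for i in range(17)]         # example sample: dyadic grid on [0, 1]
chains = build_chains(grid, leq_line)
print(len(chains) - 1, select_level(chains, leq_line))
print(bounds(chains, 2, leq_line, 0.0, 0.3125))
```

On this toy sample the chain can be refined until it contains every grid point, and at a coarser level the two bounds bracket the true distance within one step $2^{-p}$.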
Remark 5. Given $x,y$ in a geodesic space, the set of possible $m_{xy}$ coincides with the set of middle points of $(x,y)$.

Intuitively, the largest $A^n_p$ is the longest geodesic path we can "make" from $E_n$, with each point being a middle point of its neighbors on $A^n_p$, and both $d^+_{n,p}$ and $d^-_{n,p}$ define a "metric" by comparing distances with the ones on this longest "segment" $A^n_p$. Then $p$ is chosen so that $d^+_{n,p}$ and $d^-_{n,p}$ are "precise" (with a high $p$) and close enough.

Theorem 6. Let $(E,d)$ be a complete compact geodesic space, with diameter 1. Set $E_n = \{x_1,\dots,x_n\} \subset E$.

Then, for a constant $C_0 > 0$,

$$\sup_{u,v \in E_n} |d(u,v) - d^+_{n,p_n}(u,v)| \le C_0\, d_H(E_n,E)\,(1 - \log d_H(E_n,E)),$$

$$\sup_{u,v \in E_n} |d(u,v) - d^-_{n,p_n}(u,v)| \le C_0\, d_H(E_n,E)\,(1 - \log d_H(E_n,E)).$$

Corollary 7. Let $(E,d)$ be a complete compact geodesic space, with diameter 1, and $E_n$ be a finite subset of $(E,d)$. Then, one can construct a metric $d_n$ on $E_n$, depending only on $D_d$, such that

$$d_{GH}((E_n,d_n),(E,d)) \le 2C_0\, d_H(E,E_n)\,(1 - \log(d_H(E,E_n))),$$

where $C_0$ is the constant of Theorem 6.

Remark 8. This result implies that if $E_n$ converges to $E$ in Hausdorff metric, then the constructed $(E_n,d_n)$ also converges to $(E,d)$ in Gromov-Hausdorff metric. The hypotheses $\#E_n = n$ and $E_n \to E$ in Hausdorff metric imply that $E$ is precompact. Since it is also closed, $E$ is compact. To relax that hypothesis, one can assume that $E_n \cap B \to E \cap B$ for any closed ball $B$. In that case, the result states pointed Gromov-Hausdorff convergence of $(E_n,d_n)$ to $(E,d)$. However, since the construction of $d_n$ uses the fact that the diameter of $(E,d)$ is 1, it has to be slightly adjusted.

This result has an extra logarithmic factor compared to the one of [Ari15], which holds in $\mathbb{R}^k$ with a constant factor growing with $k$. It is not clear whether the logarithmic factor is a consequence of the method we use, or if it is needed to obtain a result independent of $k$. The benefits of our result are that it holds in generic geodesic spaces $(E,d)$ and that a computable way to build the metric $d_n$ is provided.
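
To get a feeling for the size of the extra logarithmic factor, the following minimal Python sketch evaluates the shape $u(1 - \log u)$ of the bound of Theorem 6 for a few values of $u = d_H(E_n,E)$. The constant $C_0$ is left out, and the helper names `rate` and `log_factor` are only illustrative.

```python
import math

def rate(u):
    """Shape u * (1 - log u) of the bound of Theorem 6 (constant C_0 omitted)."""
    if not 0.0 < u <= 1.0:
        raise ValueError("u must lie in (0, 1]")
    return u * (1.0 - math.log(u))

def log_factor(u):
    """Ratio rate(u) / u = 1 - log u, i.e. the price of the logarithmic term."""
    return rate(u) / u

for u in (1e-1, 1e-2, 1e-3, 1e-4):
    print(f"u = {u:.0e}   u(1 - log u) = {rate(u):.3e}   factor = {log_factor(u):.2f}")
```

The factor $1 - \log u$ grows only logarithmically as $u$ decreases, so the bound still tends to zero with $d_H(E_n,E)$.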


4 Applications to statistics 

Consider now that the points $E_n = \{X_1,\dots,X_n\}$ of $E$ are chosen randomly, in an i.i.d. setting. Then, if the law of the $X_i$ is smooth enough, the set $E_n$ will converge to $E$ in Hausdorff metric. The following proposition gives a more precise statement.

Proposition 9. Let $(E,d)$ be a geodesic space of diameter 1, such that

$$\mathcal{N}(E,t) \le \frac{C}{t^d}$$

for all $t > 0$, where $\mathcal{N}(E,t)$ denotes the minimal number of balls of radius $t$ needed to cover $E$, $C$ is a positive constant, and $d$ an integer. Let $\mu$ be a Borel probability measure on $(E,d)$ such that



for some $c > 0$ and any ball $B_t$ of radius $t > 0$. Let $n \in \mathbb{N}$ and let $E_n = \{X_1,\dots,X_n\}$ be a set of i.i.d. random variables with common law $\mu$. Then, there exists a constant $K$ depending only on $c$ and $C$ such that



Given this random set $E_n$ and the metric-comparison function $D_d$ on this set, our main result allows us to build a metric $d_n$ on $E_n$ such that $(E_n,d_n)$ converges to $(E,d)$ at a speed we can control in expectation.
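
The convergence of a random sample in Hausdorff metric can be observed by simulation in the simplest geodesic space of diameter 1. The following minimal Python sketch assumes $E = [0,1]$ with the usual metric and a uniform law; these choices, the helper names and the Monte Carlo parameters are illustrative assumptions, not part of the paper.

```python
import random

def hausdorff_to_interval(sample):
    """Hausdorff distance between a finite non-empty subset of [0, 1] and [0, 1].

    The farthest point of [0, 1] from the sample is either an end point of the
    interval or the centre of the largest gap between consecutive sample points.
    """
    xs = sorted(sample)
    half_gap = max(((b - a) / 2 for a, b in zip(xs, xs[1:])), default=0.0)
    return max(xs[0], 1.0 - xs[-1], half_gap)

def mean_hausdorff(n, trials=200, seed=0):
    """Monte Carlo estimate of E d_H(E_n, [0, 1]) for n i.i.d. uniform points."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(trials):
        total += hausdorff_to_interval([rng.random() for _ in range(n)])
    return total / trials

for n in (10, 100, 1000):
    print(n, mean_hausdorff(n))
```

The estimates decrease as $n$ grows, which is the qualitative behaviour quantified by the proposition.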

Corollary 10. Let (E, d) be a geodesic space of diameter 1, such that 




for all $t > 0$, where $\mathcal{N}(E,t)$ denotes the minimal number of balls of radius $t$ needed to cover $E$, $C$ is a positive constant, and $d$ an integer. Let $\mu$ be a Borel probability measure on $(E,d)$ such that



for some $c > 0$ and any ball $B_t$ of radius $t > 0$. Let $n \in \mathbb{N}$ and let $E_n = \{X_1,\dots,X_n\}$ be a set of i.i.d. random variables with common law $\mu$.

Then, one can construct a metric $d_n$ on $E_n$ based only on the function

$$D_d : (w,x,y,z) \in E_n^4 \mapsto \mathbf{1}_{d(w,x) \le d(y,z)}$$

such that there exists a constant $K > 0$



5 Proofs 

5.1 Main theorem 

The proof of Theorem 6 is based on the following lemmas.

Lemma 11. In the setting of Theorem 6, denote by $d_H$ the Hausdorff metric. Then,

$$\forall n \ge 1,\ \forall p \ge 1,\ \forall a,b \in A^n_p,\quad |d(a,b) - c_n(a,b)| \le 6p\, d_H(E_n,E).$$

Lemma 12. In the setting of Theorem 6,

$$p_n \ge \left\lfloor \frac{-\log(C_0\, d_H(E_n,E)) - \log(\log(e/d_H(E_n,E)))}{\log 2} \right\rfloor. \quad (4)$$

Lemma 13. In the setting of Theorem 6, for $p \le p_n$ and any $u,v \in \bigcup_{n\ge1} E_n$,

1. $d^+_{n,p}(u,v) \le d^-_{n,p}(u,v) + 2^{-p}$,

2. $d^-_{n,p}(u,v) \le d^+_{n,p}(u,v)$.

Proof of Lemma 11. Set $\varepsilon = d_H(E_n,E)$.

First step

Remark 5 states that since $E$ is geodesic, for all $a,b \in E$,

$$d(a,m_{ab}) \vee d(m_{ab},b) = \frac{d(a,b)}{2}.$$

Also, by definition of the Hausdorff metric, for all $n \ge 1$, there exists $m_n \in E_n$ such that $d(m_n,m_{ab}) \le \varepsilon$, so that

$$d(a,m_n) \vee d(m_n,b) \le \frac{d(a,b)}{2} + \varepsilon.$$

Taking $a,b \in A^n_p$, it shows that

$$d(a,m^n_{ab}) \vee d(m^n_{ab},b) \le \frac{d(a,b)}{2} + \varepsilon.$$

Using $d(a,b) \le d(a,m^n_{ab}) \vee d(m^n_{ab},b) + d(a,m^n_{ab}) \wedge d(m^n_{ab},b)$, one can show that

$$d(a,m^n_{ab}) \wedge d(m^n_{ab},b) \ge \frac{d(a,b)}{2} - \varepsilon.$$

Thus, for all $a,b \in A^n_p$,

$$\left| d(a,m^n_{ab}) - \frac{d(a,b)}{2} \right| \le \varepsilon. \quad (5)$$

Second step

We want to show recursively on $p$ that for all $p \ge 0$, setting $A^n_p = (a_1,\dots,a_{1+2^p})$, for all $1 \le i \le 2^p$,

$$|d(a_i,a_{i+1}) - 2^{-p}| \le (3 - 2^{-p})\,\varepsilon.$$

The triangle inequality and the fact that the diameter of $E$ is 1 show that it is true for $p = 0$. Suppose it holds true for all $0 \le p \le q$. Then, set $A^n_{q+1} = (b_1,\dots,b_{1+2^{q+1}})$. Thus, for any odd $i$ (and similarly for $i$ even), $b_{i+1} = m^n_{b_i b_{i+2}}$, so that, using (5) and the recurrence assumption,

$$d(b_i,b_{i+1}) \le \frac{d(b_i,b_{i+2})}{2} + \varepsilon \le 2^{-(q+1)} + (3 - 2^{-(q+1)})\,\varepsilon.$$

Similarly, $d(b_i,b_{i+1}) \ge 2^{-(q+1)} - (3 - 2^{-(q+1)})\,\varepsilon$.

So that, for all $1 \le i \le 2^p$,

$$|d(a_i,a_{i+1}) - 2^{-p}| \le 3\varepsilon. \quad (6)$$

Third step

Inequality (6) proves the lemma for $p = 1$. Suppose it is true for all $1 \le p \le k$. Then, take $a,b \in A^n_{k+1} = (a_1,\dots,a_{1+2^{k+1}})$.

• If $a,b \in A^n_k$, then it is already supposed to be true.

• If $a = a_i \in A^n_k$ and $b = a_j \notin A^n_k$, with $i < j$, then $a_{j-1}, a_{j+1} \in A^n_k$, so that

$$\begin{aligned}
d(a,b) - c_n(a,b) &\le d(a_i,a_{j-1}) - c_n(a_i,a_{j-1}) + d(a_{j-1},a_j) - c_n(a_{j-1},a_j) \\
&\le 6k\varepsilon + 3\varepsilon, \\
c_n(a,b) - d(a,b) &\le c_n(a_i,a_{j+1}) - d(a_i,a_{j+1}) - c_n(a_{j+1},a_j) + d(a_j,a_{j+1}) \\
&\le 6k\varepsilon + 3\varepsilon.
\end{aligned}$$

• If $a,b \notin A^n_k$, the same ideas lead to

$$|d(a,b) - c_n(a,b)| \le 6(k+1)\varepsilon,$$

which concludes the proof. □

Proof of Lemma 12. First remark that if $A^n_p$ exists and

$$\forall a,b \in A^n_p,\quad |d(a,b) - c_n(a,b)| < 2^{-(p+1)},$$

then

$$\forall a,b \in A^n_p,\quad d^+_{n,p}(a,b) = d^-_{n,p}(a,b).$$

Using Lemma 11 and the fact that for any $a,b \in E_n$ such that $d(a,b) \ge 2^{-p}$, the set $M^n_{ab}$ is not empty if $d_H(E_n,E) < 2^{-(p+1)}$ (as it contains the closest point of $E_n$ to $m_{ab}$), one can show recursively on $p$ that $A^n_p$ exists for any $n,p$ such that $6p\,d_H(E_n,E) < 2^{-(p+1)}$. Thus, Lemma 11 and the remark above imply that if $6p\,d_H(E_n,E) < 2^{-(p+1)}$, then $p_n \ge p$. Consequently, using Lemma 14 (with $u = d_H(E_n,E)$, $x = p\log 2$ and a suitable constant $c$),

$$p_n \ge \left\lfloor \frac{-\log(C_0\, d_H(E_n,E)) - \log(\log(e/d_H(E_n,E)))}{\log 2} \right\rfloor.$$

□


Proof of Lemma 13. Set $n \in \mathbb{N}^*$ and $p \le p_n$, and denote $(a_1,\dots,a_{2^p+1}) = A^n_p$.

1. Take any $a_i,a_j \in A^n_p$ such that $d^+_{n,p}(u,v) = c_n(a_i,a_j)$ and

$$d(a_i,a_j) \ge d(u,v).$$

Then, by definition of $d^+_{n,p}(u,v)$ (as a minimum),

$$d(a_i,a_{j-1}) \le d(u,v),$$

so that

$$d^-_{n,p}(u,v) \ge c_n(a_i,a_{j-1}) = d^+_{n,p}(u,v) - 2^{-p}.$$

2. First, remark that since $A^n_p$ increases with $p$, $d^-_{n,p}(u,v)$ increases with $p$ and $d^+_{n,p}(u,v)$ decreases with $p$, so that it suffices to show $d^-_{n,p_n}(u,v) \le d^+_{n,p_n}(u,v)$. In order to obtain a contradiction, suppose that there exist $u,v \in \bigcup_{n\ge1} E_n$ such that $d^-_{n,p_n}(u,v) > d^+_{n,p_n}(u,v)$. Then, there exist $a_{i_+}, a_{j_+}, a_{i_-}, a_{j_-} \in A^n_{p_n}$ such that

$$\begin{aligned}
c_n(a_{i_-},a_{j_-}) &= d^-_{n,p_n}(u,v), \\
d(a_{i_-},a_{j_-}) &\le d(u,v), \\
c_n(a_{i_+},a_{j_+}) &= d^+_{n,p_n}(u,v), \\
d(a_{i_+},a_{j_+}) &\ge d(u,v),
\end{aligned}$$

with

$$d(a_{i_-},a_{j_-}) \le d(a_{i_+},a_{j_+}). \quad (8)$$

Thus, (8) gives $d^+_{n,p_n}(a_{i_-},a_{j_-}) \le d^+_{n,p_n}(a_{i_+},a_{j_+})$.

So, using the definitions of $d^+_{n,p}$ and $d^-_{n,p}$ (as minimum and maximum), and the definition of $p_n$,

$$c_n(a_{i_-},a_{j_-}) \le d^-_{n,p_n}(a_{i_-},a_{j_-}) = d^+_{n,p_n}(a_{i_-},a_{j_-}) \le d^+_{n,p_n}(a_{i_+},a_{j_+}) \le c_n(a_{i_+},a_{j_+}).$$

This contradicts the assumption $d^-_{n,p_n}(u,v) > d^+_{n,p_n}(u,v)$, which was therefore wrong.

□


Proof of Theorem 6. Set $n \in \mathbb{N}^*$ and $p \le p_n$. Let $u,v \in E_n$. Using Lemma 11,

$$\begin{aligned}
d^+_{n,p}(u,v) &= \min\{c_n(a,b);\ d(a,b) \ge d(u,v),\ a,b \in A^n_p\} \\
&\ge \min\{c_n(a,b);\ c_n(a,b) + 6p\,d_H(E_n,E) \ge d(u,v),\ a,b \in A^n_p\} \\
&\ge d(u,v) - 6p\,d_H(E_n,E).
\end{aligned}$$

Similarly,

$$d^-_{n,p}(u,v) \le d(u,v) + 6p\,d_H(E_n,E).$$

Thus, Lemma 13 implies

$$\begin{aligned}
d(u,v) - 6p\,d_H(E_n,E) - 2^{-p} &\le d^+_{n,p}(u,v) - 2^{-p} \\
&\le d^-_{n,p}(u,v) \\
&\le d^-_{n,p_n}(u,v) \\
&\le d^+_{n,p_n}(u,v) \\
&\le d^+_{n,p}(u,v) \\
&\le d^-_{n,p}(u,v) + 2^{-p} \le d(u,v) + 6p\,d_H(E_n,E) + 2^{-p}.
\end{aligned}$$

Taking

$$p = \left\lfloor \frac{-\log(C_0\, d_H(E_n,E)) - \log(\log(e/d_H(E_n,E)))}{\log 2} \right\rfloor \le p_n$$

as in (4), Lemma 14 implies that $6p\,d_H(E_n,E) \le 2^{-p-1}$, so that

$$\sup_{u,v \in E_n} |d(u,v) - d^+_{n,p_n}(u,v)| \le 2^{-p+1} \le 4C_0\, d_H(E_n,E)\,(1 - \log d_H(E_n,E)),$$

$$\sup_{u,v \in E_n} |d(u,v) - d^-_{n,p_n}(u,v)| \le 2^{-p+1} \le 4C_0\, d_H(E_n,E)\,(1 - \log d_H(E_n,E)).$$

□

Lemma 14. Set $u \in (0,1]$, $x \in \mathbb{R}$ and $c \ge 1$ such that

$$x \le \log\frac{1}{cu} - \log(1 - \log(u)),$$

then

$$cxu \le e^{-x}.$$
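
Lemma 14 is elementary and can be sanity-checked numerically. The following minimal Python sketch evaluates the threshold on $x$ and the conclusion $cxu \le e^{-x}$ for a few arbitrary values of $u$ and $c$; the helper names and the tested values are assumptions made for illustration and are not the constants used in the paper.

```python
import math

def lemma14_threshold(u, c):
    """Right-hand side of the assumption: log(1 / (c u)) - log(1 - log u)."""
    return math.log(1.0 / (c * u)) - math.log(1.0 - math.log(u))

def lemma14_holds(u, c, x):
    """Check the conclusion c x u <= exp(-x)."""
    return c * x * u <= math.exp(-x)

for u in (1.0, 0.5, 1e-2, 1e-6):
    for c in (1.0, 5.0, 20.0):
        x = lemma14_threshold(u, c)
        print(u, c, lemma14_holds(u, c, x), lemma14_holds(u, c, x - 1.0))
```

The assumption on $x$ cannot simply be dropped: for instance $u = 0.5$, $c = 1$ and $x = 5$ violate the conclusion.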


5.2 Corollary 

Proof of Corollary 7. It suffices to choose the closest metric $d_n$ to $d^+_{n,p_n}$ in the sup sense:

$$d_n \in \operatorname{argmin}\left\{ \sup_{u,v \in E_n} |\delta(u,v) - d^+_{n,p_n}(u,v)|;\ \delta \text{ is a metric on } E_n \right\}.$$

Then, since $E_n \subset E$, there exists a surjective map $f : E \to E_n$ such that

$$d_H((E_n,d),(E,d)) = \sup_{u,v \in E} |d(u,v) - d(f(u),f(v))|,$$

so that

$$\begin{aligned}
d_{GH}((E_n,d_n),(E,d)) &\le d_{GH}((E_n,d_n),(E_n,d)) + d_H(E_n,E) \\
&\le \sup_{u,v \in E_n} |d(u,v) - d_n(u,v)| + d_H(E_n,E) \\
&\le 2 \sup_{u,v \in E_n} |d(u,v) - d^+_{n,p_n}(u,v)| + d_H(E_n,E) \\
&\le C\, d_H(E_n,E)\,(1 - \log d_H(E_n,E)).
\end{aligned}$$

The argmin does not necessarily exist, but any metric close enough satisfies the bound too. □

5.3 Proposition

Proof. For $t > 0$, denote $\mathcal{N}(E,t)$ by $m_t$. Given balls $(B_i)_{1 \le i \le m_{t/2}}$ of radius $t/2$ that cover $E$,

$$\begin{aligned}
\mathbb{E}\,d_H(E_n,E) &\le \mathbb{E}\big[d_H(E_n,E)\,\mathbf{1}_{\{d_H(E_n,E) > t\}}\big] + \mathbb{E}\big[d_H(E_n,E)\,\mathbf{1}_{\{d_H(E_n,E) \le t\}}\big] \\
&\le \mathbb{P}(d_H(E_n,E) > t) + t \\
&\le \mathbb{P}\Big(\bigcup_{1 \le i \le m_{t/2}} \bigcap_{1 \le k \le n} \{X_k \notin B_i\}\Big) + t \\
&\le \sum_{1 \le i \le m_{t/2}} e^{n \log(1 - \mu(B_i))} + t.
\end{aligned}$$

Bounding each $\mu(B_i)$ from below by the assumption on $\mu$, and $m_{t/2}$ from above by the covering assumption, and then choosing $t$ of order $(\log n / n)^{1/d}$, leads to

$$\mathbb{E}\,d_H(E_n,E) \le K \left(\frac{\log n}{n}\right)^{1/d}.$$

□


6 Conclusion 

We have shown that ordinal information on the metric of a geodesic space $(E,d)$ is enough to recover the full metric, up to a constant factor. Also, given a sample $E_n$ of the geodesic space $E$, and the ordinal information on that sample, a metric $d_n$ can be built in such a way that the sample $(E_n,d_n)$ equipped with this metric is as close, in Gromov-Hausdorff metric, to the geodesic space $(E,d)$ as the sample $(E_n,d)$ equipped with the true metric, up to a logarithmic factor.

This allows us to quantify how much information the full ordinal information on the metric carries compared to the metric itself: it is enough to recover the metric sharply (i.e. up to a log factor). An interesting question is whether weaker ordinal information would be as efficient. For instance, knowing only $D_d$ on quadruples $(w,x,y,z)$ of the form $(x,y,x,z)$ would be useful. It has already been shown in [KvL14], on $\mathbb{R}^d$, that this weaker notion of ordinal information is enough to recover the metric, but rates of convergence or sharp bounds are still unknown.

An important question left open is then how much information is required to recover the metric. Is an unweighted $k$-nearest neighbor graph enough?